
PCT    WORLD INTELLECTUAL PROPERTY ORGANIZATION

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)

(11) International Publication Number: WO 99/65927

(43) International Publication Date:



(51) International Patent Classification 6: C07H 21/00    A2

(21) International Application Number: PCT/US99/13479

(22) International Filing Date: 15 June 1999 (15.06.99)

(30) Priority Data: 60/089,649    17 June 1998 (17.06.98)    US

(71) Applicant: MAXYGEN, INC. [US/US]; 515 Galveston Drive, Redwood City, CA 94063 (US).

Francisco, CA 941 11 (US). 



<"> %« S cTc„^% AM ^ I 
Smx, no. nz. p l pt, i ™ s % S& za*$ 

ZW), Eurasian paten, (AM. AZ BY KG, l^ M 

£f. Bj! CP.' CG, CI. CM. GA, GN, GW, ML. MR. NE. 
SN, TD, TG). 



upon receipt of that report. 



(57) Abstract conferring a desired phenotype 

t . Ht f „ th . nroduction of polynucleotides with a desired P r °^^ c ^' hod inc . udes making insertions 

and/or deletions at random sites in DNA segments in a poy 



recursively. 






WO 99/65927 



PCT/US99/13479 



METHOD FOR PRODUCING POLYNUCLEOTIDES 
WITH DESIRED PROPERTIES 

Field of the Invention

The present invention relates to methods for the production of polynucleotides 
conferring a desired phenotype and/or encoding a polypeptide having an advantageous 
predetermined property which is selectable or can be screened for. 

Background of the Invention

Traditional molecular biological methods for generating novel genes and proteins 
generally involved rational or directed mutation. An example is the generation of a 
polynucleotide encoding a fusion or chimeric protein by using known restriction sites to combine 
functional domains from two characterized proteins. Another example is the introduction of a
point mutation at a specific site in a polypeptide. Although useful, the power of these and similar
methods is limited by the requirement for sequence or restriction map information to facilitate
the mutagenesis, and by the limited number of variants that can be efficiently generated. 

An alternative approach to the generation of variants uses random recombination 
techniques such as "DNA shuffling" (Patten et al., 1997, Curr. Opin. Biotech. 8:724-733).
DNA shuffling entails performing iterative cycles of recombination and screening or selection 
to "evolve" individual genes, whole plasmids or viruses, multigene clusters, or whole genomes. 
Such techniques do not require the extensive analysis and computation required by conventional
methods for engineering of polynucleotides and polypeptides. Moreover, DNA shuffling allows 
the recombination of large numbers of mutations in a minimum number of selection cycles, in 
contrast to traditional, pairwise recombination events. Thus, DNA shuffling techniques provide
advantages in that they provide recombination between mutations in any or all of these, thereby
providing a very fast way of exploring the manner in which different combinations of mutations
can affect a desired result.

The present invention provides methods that may be used alone or in combination 
with random recombination techniques such as DNA shuffling to generate novel polynucleotides 
having, or encoding a polypeptide having, a desired property or combination of properties. 

Summary of the Invention

In one aspect, the invention provides a method of producing a DNA segment
having a desired property or combination of properties by mutating a substrate population. The 

method involves: 

a) mutating a substrate population that includes a plurality of DNA segments by: 
i) making insertions at random sites in the segments (random insertion),
ii) making deletions at random sites in the segments (random deletion), or

both, to produce a mutated population including mutated DNA segments, 

b) screening the mutated population to obtain a first selected population that includes 
at least one DNA segment with a first desired property, 

c) mutating the first selected population by making random insertions, random 
deletions, or both, to produce a recursively mutated population, and, 

d) screening the recursively mutated population to obtain a recursively selected 
population that includes at least one DNA segment with a second desired property. 
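
Steps (a) through (d) above can be sketched in silico as a loop of mutation and screening. The following is a minimal illustration only, not the patented laboratory procedure: the function names, the toy "TATA" insertion polynucleotide, the 3-base deletion length, and the string-matching assays are all assumptions introduced for the example.

```python
import random


def random_insertion(seq, insert, rng):
    """Insert `insert` at a uniformly chosen position in `seq`."""
    pos = rng.randrange(len(seq) + 1)
    return seq[:pos] + insert + seq[pos:]


def random_deletion(seq, length, rng):
    """Delete `length` bases starting at a uniformly chosen position."""
    if length >= len(seq):
        return seq  # too short to delete from; leave unchanged
    pos = rng.randrange(len(seq) - length + 1)
    return seq[:pos] + seq[pos + length:]


def mutate(population, insert, del_len, rng):
    """Steps (a)/(c): one random insertion or one random deletion per segment."""
    mutated = []
    for seq in population:
        if rng.random() < 0.5:
            mutated.append(random_insertion(seq, insert, rng))
        else:
            mutated.append(random_deletion(seq, del_len, rng))
    return mutated


def screen(population, has_property):
    """Steps (b)/(d): keep only the segments that pass the assay."""
    return [seq for seq in population if has_property(seq)]


def recursive_mutation_and_screening(substrate, insert, del_len, assays, seed=0):
    """Run one mutate-then-screen cycle per assay in `assays`."""
    rng = random.Random(seed)
    selected = list(substrate)
    for has_property in assays:
        selected = screen(mutate(selected, insert, del_len, rng), has_property)
        if not selected:
            break  # nothing passed the screen; stop early
    return selected


# Hypothetical homogeneous substrate population and toy assays.
substrate = ["GGCCGGCCGGCCGGCCGGCC"] * 200
assays = [
    lambda s: "TATA" in s,           # first desired property
    lambda s: s.count("TATA") >= 2,  # second desired property
]
final = recursive_mutation_and_screening(substrate, "TATA", 3, assays)
```

Passing a longer list of assays corresponds to additional cycles of mutation and screening after step (d).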

In some embodiments the method further includes at least one additional cycle 
of mutation and screening (e.g., mutating the recursively selected population and screening the 
resulting recursively mutated population to obtain a new recursively selected population with a
desired property) after step (d). In some embodiments, shuffling of one or a combination of 
polynucleotides in a recursively selected population is carried out. 

In various embodiments, the second desired property may be the same as or different
from the first desired property, and may be a combination of properties. In some embodiments, 
the polynucleotides in the recursively selected population have a property that is enhanced when 
compared to the polynucleotides in the first selected population. In some embodiments the 
substrate population includes DNA segments encoding a polypeptide, a catalytic RNA, a 
promoter sequence or a vector. In some embodiments the substrate population is homogeneous. 
In some embodiments a polynucleotide that encodes a polypeptide is screened for an activity 
such as an enzymatic activity, a substrate specificity, or a binding activity of a polypeptide. 

In another aspect, the invention provides a method of producing a DNA segment 
having a desired property by: 

a) mutating a first substrate population that includes a plurality of DNA segments 

by: 

i) making insertions at random sites in the segments (random insertion), 
ii) making deletions at random sites in the segments (random deletion), or
both, to produce a first mutated population of mutated DNA segments;

b) mutating a second substrate population that includes a plurality of DNA segments by:

i) making insertions at random sites in the segments,

ii) making deletions at random sites in the segments, or both,

to produce a second mutated population of mutated DNA segments;

c) recombining the first substrate population and the second substrate population to
produce a recombined population; and,

d) screening the recombined population to identify at least one DNA segment with
the desired property.

In one embodiment, the first and second mutated populations are screened to
produce a first and second selected population, each having a desired property, and the selected
populations are recombined.

In various embodiments, the recombination may be achieved by shuffling or
directed recombination. In some embodiments the first desired property and the second desired
property are the same. In some embodiments the substrate population includes DNA segments
encoding a polypeptide, a catalytic RNA, a promoter sequence or a vector. In some
embodiments the substrate population is homogeneous. In some embodiments a polynucleotide
that encodes a polypeptide is screened for an activity such as an enzymatic activity, a substrate
specificity, or a binding activity of a polypeptide.

In another aspect, the invention provides a method of producing a DNA segment
having a desired property by:

a) mutating a substrate population that includes a plurality of DNA segments by:

i) making insertions at random sites in the segments,

ii) making deletions at random sites in the segments,

or both, to produce a mutated population of mutated DNA segments;

b) screening the mutated population to obtain a selected population that includes at
least one DNA segment with the desired property;

c) shuffling at least one DNA segment from the selected population to produce a
recombined population; and

d) screening the recombined population for a desired property.

In one embodiment, the shuffling involves conducting a polynucleotide
amplification process on overlapping segments of at least one polynucleotide from the selected 
population under conditions under which one segment serves as a template for extension of 
another segment, to generate a population of recombinant polynucleotides. 
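
The outcome of such template switching can be illustrated with a deliberately crude toy model. The sketch below does not simulate annealing or polymerase extension; it assumes equal-length, pre-aligned parent sequences and simply lets the growing product switch parent at fixed segment boundaries. The function names and the fixed `segment_len` parameter are hypothetical.

```python
import random


def shuffle_aligned(parents, segment_len, rng):
    """Build one chimera by picking a template parent for each segment."""
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("this toy model needs equal-length, aligned parents")
    pieces = []
    for start in range(0, length, segment_len):
        template = rng.choice(parents)  # template switch at a segment boundary
        pieces.append(template[start:start + segment_len])
    return "".join(pieces)


def shuffle_population(parents, segment_len, n_products, seed=0):
    """Generate a population of recombinant sequences from the parents."""
    rng = random.Random(seed)
    return [shuffle_aligned(parents, segment_len, rng) for _ in range(n_products)]


products = shuffle_population(["AAAAAAAA", "CCCCCCCC"], 4, 20)
```

Real parents that carry insertions or deletions differ in length, so an actual shuffling reaction relies on local homology rather than fixed coordinates; the fixed boundaries here are purely a simplification.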

In some embodiments the substrate population includes DNA segments encoding 
a polypeptide, a catalytic RNA, a promoter sequence or a vector. In some embodiments the
substrate population is homogeneous. In some embodiments a polynucleotide that encodes a
polypeptide is screened for an activity such as an enzymatic activity, a substrate specificity, or 
a binding activity of a polypeptide. 

Brief Description of the Figures

Figure 1 provides a flow-diagram of an embodiment of the invention in which 
recursive steps of random insertion or deletion and screening are employed to produce a DNA 
segment with a desired property. 

Figure 2 provides a flow-diagram of an embodiment of the invention in which 
random insertion or deletion is carried out on two different substrate populations, which are then 
recombined. 

Figure 3 provides a flow-diagram of an embodiment of the invention in which 
random insertion or deletion, screening, and random recombination steps are employed to 
produce a DNA segment with a desired property. 

Detailed Description

I. Definitions

The following terms are defined to provide additional guidance to one of skill in
the practice of the invention: 

The term "shuffling," as used herein, refers to techniques for random 
recombination between substantially homologous but non-identical polynucleotides. Various 
shuffling methods are described in Patten et al., 1997, Curr. Opin. Biotech. 8:724-733; Stemmer,
1994, Nature 370:389-391; Stemmer et al., 1994, Proc. Natl. Acad. Sci. USA 91:10747-10751;
Zhao et al., 1997, Nucleic Acids Res. 25:1307-1308; Crameri et al., 1998, Nature 391:288-291;
Crameri et al., 1997, Nat. Biotech. 15:436-438; Arnold et al., 1997, Adv. Biochem. Eng.
Biotechnol. 58:2-14; Zhang et al., 1997, Proc. Natl. Acad. Sci. USA 94:4504-4509; Crameri et
al., 1996, Nat. Biotech. 14:315-319; Crameri et al., 1996, [...];
WO95/22625; WO97/20078; WO97/35957; WO97/35966; WO98/13487,

[...] also described in the following

US Patent Applications Serial Nos.: 08/537,874; 08/621,859; 08/792,409;
08 82 589; 09 02,769; 60/074,294; 08/722,660; 08/938,690. Each of the

[...] all purposes. One method of shuffling comprises conducting a polynucleotide amplification
process on overlapping segments [...]

whereby one segment serves as a template for extension of another segment, to generate a
population of recombinant polynucleotides, and screening or selecting a recombinant
polynucleotide or an expression product thereof for a desired property. Some methods of
shuffling use random point mutations (typically introduced in a PCR amplification step) as a

sh or t erthana b out50bases(e.,,a b out6,9, 1 2, 15,18,21,25, V'*"^****™* * 
t e m >.ynucleotide,"as.^^^^ 

at ,east about 60, 100, 200, 300, 500, 1000, 5000, 10,000 bases orbase pairs in length, or even 

P o>v™c.eotide(or,,,,an^^^ 
Llungsystem.mclu^^ 

25 Ltivitv), fl u^^ 

as receptor, ligand, antibody or antibody fragment, antigen, epttope, or other btolo teal 
— lecule, W ^ 

promoter strength, regulation), a sequence affecting RNA processmg (e.g., RNA stab.hty o 
30 dicing) a Intence affecting translation (e.g., level, regulation, post— pnona 
30 Illn), : a seance affecting other expresston property of a gene ~ 

relative element, a protein-binding e.ement; a vector, an encoded protetn (e.g., enzymaUc 
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. activity „, specificity ; bmding acdvi(y ^ pj> tQ dena ^ ^ ^ 

e ' 8 - 0f Ca ' alytlC ^ ^ ^ ^ Addit,0nal - descnbed herein 

111 mCOrPOrated herein ' " ^ aPP3rent 10 ~ * ^ ™^ «* 

The term "evolve," as used herein, refers to the process of inducing variation
into a population of macromolecules and selecting or screening for acquisition of a desired
property « to ^^ rfl ^ w ^. 

molecules different from the molecules of the starting population. 

II. Overview

The present invention provides novel methods for the [...] of
[...] selectable or can be screened for). In one aspect, the invention provides methods for
generating diversity in a population of polynucleotides by random insertion or deletion of
sequences and [...]

multiple cycles of insertion/deletion and screening are carried out. In some embodiments the
properties of the variants are evolved by one or more of a variety of methods.

Typically the mutated polynucleotides are double stranded DNA segments

of genes, vectors, polypeptide-coding sequences, expression regulatory sequences (e.g.,
promoters, enhancers), and the like.

In one embodiment of the invention, a population of polynucleotides (i.e., a
"substrate population") is mutated by random insertion or deletion, and the resulting "mutated
population" is screened to identify a subpopulation of species with a desired property (i.e., a
"selected population"). The selected population is itself mutated by random insertion or
deletion, and the resulting twice mutated population is again subjected to screening to produce
a new selected population. The second round of screening can be for the same or a similar

[...] acquired a sequence conferring chloramphenicol resistance not found in the substrate population
and the second screen could be for increased chloramphenicol resistance (the same or similar
property), or, alternatively, in subsequent rounds of mutation and screening for the acquisition
[...] is not
intended to limit the invention in any way.

by random insertion or deletion, producing corresponding mutated populations. In many
embodiments, the two [...]

(e.g., each mutated population is screened for a different property). Following production of the
two or more mutated populations (or following screening, if it takes place), [...]
segments from each of the mutated populations are combined to produce a single recombined [...]

"classical" molecular cloning techniques in which a selected region in one [...]

polynucleotide. "Classical" techniques include (i) restriction of two populations of DNA
molecules and [...]

of the second population, (ii) amplification of a region of one polynucleotide [...]
by PCR or inverse PCR) and ligation into the polynucleotides of the second population, (iii) and
[...] methods known in the art. [...]
properties). In some embodiments, subsequent cycles of random insertion or
deletion and screening are carried out. This process is outlined in Fig. 2; this
figure is not intended to limit the invention.

In a third embodiment, a substrate population of polynucleotides is mutated by

a desired property [...]

isolated from it) is then evolved by random recombination (including random [...]

This process is outlined in Fig. 3; this figure also is not intended to limit the invention.

The invention will now be described in greater detail.

III. Mutating the Substrate Population

a) Generally

An initial step in the method of the invention is the introduction of insertions or
deletions at random sites in a population of polynucleotides. Insertions and deletions are
sometimes collectively referred to herein as "mutations." For convenience, a population of
polynucleotides into which mutations are to be introduced may be referred to as the "substrate 

population." 

Although the method can be carried out on any polynucleotides that can be 
mutated in a random fashion by insertion or deletion, as noted supra the polynucleotides will
most often be DNA molecules (including cDNA), usually double-stranded DNA molecules. The
DNA molecules making up the substrate population may be of any of several types, including
DNA molecules comprising polypeptide coding sequences (e.g., encoding a protein, multiple
proteins, or portions of a protein), regulatory DNAs (e.g., promoters, enhancers), vectors (e.g.,
an expression vector), and viruses (e.g., to produce attenuated virions). These DNA molecules
are sometimes also referred to as "DNA segments." 

The substrate population will comprise a plurality of DNA segments, typically 
at least 10, more often at least 10[...], or at least 10[...] DNA segments. In many embodiments the
DNA segments in any particular substrate population are identical to each other, being derived
from a single parental DNA (e.g., plasmid DNAs prepared from the same bacterial culture).
Such a population is a "homogeneous" substrate population. In some embodiments, however,
the substrate population includes DNA segments that are not identical, such as the following:
DNA segments that differ from each other by point mutations (e.g., molecules that have been
generated from a template using error-prone PCR) or other mutations (e.g., insertions or
deletions); DNA segments that are related as homologs from different organisms; and DNA
segments that are related to each other because they are products of DNA shuffling reactions
(see, e.g., Patten et al., 1997, Curr. Opin. Biotech. 8:724). In a related embodiment, the substrate
population will comprise DNA segments having unrelated sequences (for example, a substrate
population comprising several different plasmid vectors), usually with a plurality (e.g., at least
10 or 10^6) of each species present.

Mutations (insertions or deletions or both) are introduced into the DNA segments 
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* te p,««* .f bromide, »P— ,—>.«> * 




In practice, cleavage of a large population of molecules will usually result in , 

~========== 

b) Random Insertion

uubuon^s; (e.g., a Drosophila cuticle eene TATA k« v 

random sequence 12mers).
Suitable insertion polynucleotides may be generated by chemical synthesis, PCR
[...] molecules in any of a variety of ways. They may, as will be apparent from the examples
[...] or the like. Alternatively or in addition, they may act to introduce length

v to 5 4209-4218; Hall* let et al., 1997, Nuc. Acuh Res. 25.1866 1867, 

identify or eliminate nonproductive (i.e., frame srune ; V 
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i. . ««. iirf " ■* ** inv ™ ta ' - «• — - 

» 8 ~r::;:::; f *— *— - • *»~ 

When a population or polynucleotides is randomly deleted f i e Hit' 
mtroduced at random locations), there usually will h ■ • " " S 

vario.moleculesin.hepopulatL tL « fd ^ ™ " " 

vary depending i„ Ae e l f th » -V one step wil, 

P ndmg ,„ th e goals of me mvestjgator> ^ 

basepaus (e.g., a . least about 3, 6, 9 12 15 is 21 « ,« <« 65 ° f 

embodiments, however some o all eie " ^ * * «" 

base, meOrallddel,OnSma >' beIo ^.-chas a ,,eastabou. 2 00or500 

orcircu^dm^ll^ 

^edmoiecu,::!:;:::" 

— — (e,, Bal3 , or ■Crrr*"?***-*''^ 

-lecu.es are blunted by stan Jm!Z emb0dlmentS ' *" ^ 

Jh h- " A LAB0RATORy Manual 2nd ed. Vol 1-3) i n one 

==== ===== 

reduced size increases the ability to intmrf rephcafon origin). The 

When using for exam , 1 " " the ™ backbone. 

ucn using, tor example, a bacteriophage vector with a limits nxi a 

capsidcapactty),^ 

6 bacten °P h age genome would allow the packaging 
of new or larger genes without affecting essential phage functions. Notably, the present
invention allows reduction in the size of a vector and/or introduction of genes from other sources
without a priori knowledge of the function of parts of the parental vector. Thus, it is especially
useful when using an uncharacterized bacteriophage as a vector (e.g., for use in Streptomyces,
bacteriophage φC31).

As noted supra, it will sometimes be desirable, when mutating a polynucleotide
that encodes a polypeptide, to use techniques to retain a reading frame found in the parental
vector. In one embodiment, for example, a single triplet is deleted from (each of) the deleted
polynucleotides of a substrate population. This can be carried out by first inserting a resistance
cassette which may be excised (e.g., after selection) deleting 3 nucleotides. For example, a
cassette or short oligonucleotide containing a Type IIS restriction enzyme recognition site (e.g.,
EarI, SapI) can be designed which, after random insertion, can be cleaved from the circular DNA
so that a multiple of 3 nucleotides [...]
using cre/lox) may be used to excise the resistance cassette.
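
The reading-frame argument above reduces to simple arithmetic: the frame downstream of a mutation is retained when the net change in length is a multiple of 3. The following sketch illustrates this under stated assumptions; the cassette is modeled as a placeholder string, excision is modeled as removing the cassette plus the 3 bases immediately 3' of it, and all function names are hypothetical rather than taken from the source.

```python
def retains_reading_frame(parent_len, mutant_len):
    """True when the net length change is a multiple of 3."""
    return (mutant_len - parent_len) % 3 == 0


def insert_cassette(seq, cassette, pos):
    """Insert a (selectable) cassette at position `pos`."""
    return seq[:pos] + cassette + seq[pos:]


def excise_with_triplet(seq, cassette, extra=3):
    """Remove the cassette plus `extra` bases immediately 3' of it."""
    start = seq.index(cassette)
    end = start + len(cassette) + extra
    return seq[:start] + seq[end:]


parent = "ATGAAACCCGGGTTTTAA"   # hypothetical 18-nt coding sequence
cassette = "xxxxxx"             # placeholder for a resistance cassette
with_cassette = insert_cassette(parent, cassette, 5)
deleted = excise_with_triplet(with_cassette, cassette)
```

In this example `deleted` is 15 nt long, so exactly one triplet's worth of sequence has been removed and the downstream frame is unchanged. If fewer than `extra` bases follow the cassette, the slice removes fewer bases and the frame check would fail, which the helper `retains_reading_frame` makes explicit.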

d) Additional Methods

In another embodiment of the invention, a mutated population is generated from
a substrate population by the introduction of random insertions and/or deletions generated using
processive exonuclease digestion of two subpopulations of polynucleotides [...]
are then ligated to produce novel combinations of sequences, as described below.

According to this embodiment, the substrate population may be homogeneous
(i.e., a plurality of polynucleotides having the same sequence, e.g., having the sequence of a
particular gene encoding a [...]

polypeptides having related sequences, such as a family of related genes [e.g., encoding

[...] or homologs [...]

or the product of a shuffling reaction [...]

To produce a mutated population having random insertions and/or deletions, the
substrate population is divided into at least two subpopulations. A series of nested deletions is
produced from each of the, e.g., two subpopulations by incubation with exonuclease using
methods well known in the art (see, e.g., Henikoff, 1984, Gene 28:351; see also New England
Biolabs Catalog 1998/99 page 129 "Exo-Size™ Deletion Kit"). Briefly, a nuclease such as
exonuclease III is used to create unidirectional deletions in the polynucleotides of each
subpopulation. Preferably, restriction endonuclease digestion of the DNA segments in each
subpopulation is used to introduce both a nuclease susceptible end (i.e., a 5' overhang or blunt
end) and a nuclease nonsusceptible end (i.e., a 3' overhang) such that the nuclease digests in only
one direction. The at least two subpopulations differ in that the site of the nuclease susceptible
end is different in different subpopulations. After a series of deletions of varying lengths (i.e.,
nested deletions) is produced in each subpopulation (e.g., by incubating aliquots with
exonuclease for differing lengths of time) polynucleotides from each subpopulation are ligated
to produce a mixture of mutated polynucleotides having random insertions (e.g., duplications)
and/or deletions at the junction site (a mutated population).
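
The way timed aliquots yield a nested deletion series can be sketched as follows. This is an idealized model, assuming a constant digestion rate (the `rate_nt_per_min` parameter is a hypothetical value, not one given in the source) and representing the double-stranded segment as a single string digested from its left, nuclease-susceptible end.

```python
def nested_deletions(seq, rate_nt_per_min, minutes):
    """Digest `seq` from its left (nuclease-susceptible) end for each time point."""
    products = []
    for t in minutes:
        removed = min(len(seq), int(rate_nt_per_min * t))
        products.append(seq[removed:])
    return products


# Aliquots withdrawn at 0, 1 and 2 minutes, at an assumed 2 nt per minute.
series = nested_deletions("ABCDEFGHIJ", 2, [0, 1, 2])
# series == ["ABCDEFGHIJ", "CDEFGHIJ", "EFGHIJ"]
```

Each product is a suffix of the one before it, which is what makes the series "nested"; real digestion rates vary with temperature and template, so actual deletion endpoints are distributed around these idealized values.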

An example will help to illustrate this embodiment of the invention. Thus,
consider a homogeneous substrate population of DNA segments encoding a polypeptide, which
substrate population is divided into two subpopulations. In one embodiment of the method the
nuclease susceptible end in one subpopulation is introduced at the polynucleotide site
corresponding to the amino-terminus of the encoded polypeptide with digestion toward the
c-terminus, and the nuclease susceptible end in the other subpopulation is introduced at the
polynucleotide site corresponding to the carboxy-terminus of the encoded polypeptide with
digestion toward the n-terminus. For purposes of description, the two subpopulations in this
illustrative example can be referred to as producing an "amino-terminus deleted" product or a
"carboxy-terminus deleted" product.

After a series of nested deletions is produced in each subpopulation,
polynucleotides from each subpopulation are ligated to produce a mixture of mutated
polynucleotides [...]

Thus, continuing with the example provided above, and by way of illustration, and not limitation,
imagine that in each of the subpopulations deletions range from 1 base to about 99% of the
length of the polynucleotide (including, e.g., 5%, 10%, 90% and 95% deletions). It will be
appreciated that the ligation of an amino-terminus deleted molecule from which exactly 10% of
the length [...]

95% of the length of the molecule is deleted will result in a molecule that has a 5% duplication
(at the ligation junction) compared to the substrate polynucleotide sequence. Likewise, the
ligation of an amino-terminus deleted molecule from which exactly 5% of the length of the
molecule is deleted to a carboxy-terminus deleted molecule from which exactly 90% of the
[...] junction) compared to the substrate polynucleotide sequence.

It will be apparent that many variations of this basic scheme are available,
including, for example, introduction of susceptible ends at sites other than those corresponding
to polypeptide termini.

It will be appreciated that the present invention is not limited to any particular [...]
supra may be used. For example, self inserting DNA, i.e., transposons, may be used for in vivo [...]

populations) for polynucleotides that have been mutated (i.e., by insertion or deletion) [...]

in a mutated population containing some molecules, or even a substantial proportion of [...]

step will reduce the size of the population that must be subsequently screened. A variety of
methods can be used for enrichment. One method, the use of resistance cassettes, is discussed
supra. Another suitable method for enrichment of insertion events is carried out by denaturing [...]

by washing and the mutated molecules are eluted from the affinity matrix (e.g., using
temperature, urea, etc.). Another suitable method for enrichment involves inserting an [...]

(e.g., the LacI repressor). After washing, polynucleotides with the insertion can be eluted (e.g.,
in the presence of isopropyl [...]

responsible for binding can be excised from the polynucleotide, if desired, by a variety of
methods (some of which are discussed supra), leaving behind the sequence to be inserted.

[...] involves various techniques well known to persons of skill in the art of molecular biology.
[...] instructions sufficient to direct persons of skill through appropriate cloning, [...]
mutation, random recombination techniques, and other techniques found in, e.g., Berger and
Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology volume 152,
Academic Press, Inc., San Diego, CA; Sambrook et al. (1989) Molecular Cloning - A
Laboratory Manual (2nd ed.) Vol. 1-3; and Current Protocols in Molecular Biology,
F.M. Ausubel et al., eds., Current Protocols, a joint venture between Greene Publishing
Associates, Inc. and John Wiley & Sons, Inc., (1998 Supplement), and other references cited
herein and other references known in the art.

IV. Screening a Mutated Population

Another step in the method of the present invention is the screening of a mutated
population for a desired property. This results in the identification and isolation of, or
enrichment for, DNA segments that acquire the desired property as a result of the mutation (e.g.,
a new property), or in which an existing property is desirably enhanced. As used herein, the term
"screening" has its usual meaning in the art [...]

it is determined whether a DNA segment has a particular property and in the second step the
DNA segment(s) with the property are physically separated from those not having the property.
For convenience, the population of polynucleotides resulting from the screen may be referred to
as the "selected population."

In some forms of screening, identification and physical separation are achieved
simultaneously. For example, identification and separation of a polynucleotide conferring drug
resistance to a cell can be accomplished by selection of cells resistant to the drug (e.g., culturing
under conditions in which non-resistant cells do not survive). It will be clear from this example
that the "separation" step of screening does not imply or require isolation of a biochemically pure
polynucleotide with the desired property. Rather, separation means that the DNA segment of
interest is separated from other DNA segments (e.g., cells comprising other DNA segments). In
some embodiments of the invention, when screening is carried out, the physical separation of
DNA segments with the property and those without need not be absolute, and due to
methodological limitations often is not. Thus, in some embodiments, the screening of the
mutated population results in a selected population that is enriched for the DNA segments with
the desired property.
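
The notion of enrichment by an imperfect screen can be made concrete with a small calculation. The sketch below is illustrative only; the recovery and carry-over fractions are hypothetical parameters chosen for the example, not values reported in the source.

```python
def enrichment(n_pos, n_neg, recovery_pos, carryover_neg):
    """Return (fraction before, fraction after, fold enrichment) for one screen.

    n_pos / n_neg      : segments with / without the desired property
    recovery_pos       : fraction of positives retained by the screen
    carryover_neg      : fraction of negatives that slip through
    """
    before = n_pos / (n_pos + n_neg)
    kept_pos = n_pos * recovery_pos
    kept_neg = n_neg * carryover_neg
    after = kept_pos / (kept_pos + kept_neg)
    return before, after, after / before


# 10 positives among 10,000 segments; the screen keeps 90% of positives
# and lets 1% of negatives through.
before, after, fold = enrichment(10, 9990, 0.9, 0.01)
```

Here the positives rise from 0.1% of the population to roughly 8%, so the selected population is enriched but far from pure, which is one reason a further cycle of screening may be useful.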

It will be immediately apparent to those of skill that screening requires an assay
to identify DNA segments having the desired property. It will also be apparent that the specific
assay will depend upon the particular desired property. A variety of examples are provided infra
to provide additional guidance to those of skill. Numerous additional screens suitable for use in
the present invention are described in publications and disclosures describing "DNA shuffling"
methods. Thus, the reader is referred to the patents, applications, and publications listed in
Section I, supra, in the description of "shuffling," each of which is incorporated herein by
reference in its entirety and for all purposes. It will be appreciated, however, that the invention is
not limited to any particular screening method.

V. Recursive Mutation and Screening

In one embodiment of the invention, the selected population, generated as
described supra, is mutated, i.e., insertions, deletions or both are introduced at random sites in
the DNA segments in the selected population. The type of mutation may be the same as or different
from the mutations introduced into the substrate population (i.e., the original or first substrate
population). For example, in a case in which random insertions were made in the substrate
population, insertions may also be introduced in the selected population or, alternatively,
deletions may be introduced. Moreover, when insertions are made, the polynucleotide inserted
may be the same as or different from the insertion polynucleotide in the previous step. The
resulting population of mutated DNA segments may be referred to as a "recursively mutated
population" in reference to the [...]
cycle of mutation by insertion and/or deletion.

The recursively mutated population is then screened for the desired property. The
population of DNA segments resulting from this screen is referred to as a "recursively selected
population" (i.e., a "first recursively selected population"). The screen used for the "selected
population" and the "recursively selected population" may be the same or different. In
embodiments in which the same screen [...]

identify DNA segments with increasingly robust properties. For example, if the desired property
is the ability (of a DNA segment) to confer drug resistance to a cell, the second or subsequent
screening assay may [...]
of the mutated population). As another example, if the desired property is the ability of a DNA
segment to encode a polypeptide that is bound by a particular antibody, increasingly stringent
binding conditions may be employed in screens.

As illustrated in Fig. 1, additional cycles of mutation and screening may be carried
out, if desired. Generally, from 1 to 50 additional cycles will be carried out, more often from
about 3 to about 10 additional rounds. In cases in which additional cycles of mutation and
screening are carried out, it is convenient to refer to the resulting selected populations as the 
"second recursively selected population," the "third recursively selected population," etc. 

As is evident, each of the recursively selected populations contains DNA segments
with the desired property. Although in some cases the population as a whole will be useful, more
often a particular species of DNA segment will be isolated from the population and used.
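
By way of illustration only, the cycle of mutation and screening described in this section can be modeled in silico. The following Python sketch is a toy model under stated assumptions and is not part of the disclosed method: DNA segments are represented as strings; the names `mutate`, `toy_score` and `recursive_cycles` are hypothetical; the "assay" merely counts copies of an arbitrary motif; and parents are ranked together with their mutants, so that the best score cannot fall from one round to the next.

```python
import random


def mutate(segment, rng, insert_pool, max_deletion=6):
    """Return a copy of `segment` with one random insertion or one random deletion."""
    if rng.random() < 0.5 or len(segment) <= max_deletion:
        site = rng.randrange(len(segment) + 1)  # any position, including both ends
        return segment[:site] + rng.choice(insert_pool) + segment[site:]
    length = rng.randint(1, max_deletion)
    site = rng.randrange(len(segment) - length + 1)
    return segment[:site] + segment[site + length:]


def toy_score(segment, motif="TATA"):
    """Hypothetical assay: number of non-overlapping copies of `motif`."""
    return segment.count(motif)


def recursive_cycles(substrate, cycles, rng, insert_pool, library_size=200, keep=10):
    """Run `cycles` rounds of mutation and screening.

    Returns the final selected population and the best score seen in each round.
    """
    selected = list(substrate)
    best_per_round = []
    for _ in range(cycles):
        mutated = [mutate(rng.choice(selected), rng, insert_pool)
                   for _ in range(library_size)]
        # Screen: rank parents and mutants together, keep the top `keep` segments.
        ranked = sorted(selected + mutated, key=toy_score, reverse=True)
        selected = ranked[:keep]
        best_per_round.append(toy_score(selected[0]))
    return selected, best_per_round


rng = random.Random(0)  # fixed seed so the sketch is reproducible
population, best = recursive_cycles(["ACGTACGTACGTACGTACGT"], cycles=5, rng=rng,
                                    insert_pool=["TATA", "GGCC", "AT"])
print(best)
```

Substituting a more demanding scoring function in later rounds would model the increasingly stringent screens discussed above.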

In a related embodiment of the invention, random insertions or deletions are
introduced into two (or more) different substrate populations and sequence elements from each
population are combined by directed recombination or random recombination (e.g., shuffling).
Typically, different insertion sequences are introduced into each of the substrate populations.
One or each of the mutated substrate populations may be subjected to screening or selection for
a particular property conferred by the mutation of that population, prior to the recombination of
the substrate populations. Whether or not screening of the mutated substrate populations is
undertaken, the recombined population will be subjected to screening/selection for the desired
property or combination of properties.

As noted, random recombination methods include DNA shuffling techniques.
Shuffling can be carried out in conjunction with the introduction of point mutations (e.g., by
error-prone amplification), or without introduction of point mutations (e.g., by the use of
proofreading polymerases). In contrast, "directed recombination," or subcloning, refers to
methods of recombination that require knowledge of the restriction map of at least part of each
substrate population and result in the insertion of a restriction fragment from one population
into a particular restriction site in the second population. Examples include the insertion of
particular restriction fragments (by restriction and ligation) or PCR amplicons (usually by
ligation or SOE-PCR ["splicing by overlap extension-PCR"]) derived from one substrate
population into a specific site or location in the second substrate population, and ligation of two
randomly linearized substrate populations.

VII. Random Recombination of the Selected Population

In a different embodiment of the invention, the selected population (described in
§III, supra), a recursively selected population (described in §V), or a DNA segment species
[...] recombination and point mutation, e.g., DNA shuffling. It will be understood that random
recombination refers to recombination methods other than directed exchange of specific defined
sequences (e.g., the transfer of a sequence from one portion of DNA segments to a second
[...] in Section VI, supra. Random [...]
pool of DNA fragments by random fragmentation of a single DNA sequence or a family of
related DNA sequences, and the reassembly of the fragments in various combinations to produce
DNA segments with a new structure (i.e., new combinations of deletions, insertions and/or
introduced point mutations) and with the desired property.

Recursive random recombination or non-recursive random recombination

of fragmentation, recombination, and screening (e.g., at least 2, sometimes at least 5 cycles).
Typically, when a random recombination method is applied to a single DNA segment from a
selected population, a recursive recombination method will be used, e.g., Zhang et al., 1997,
Proc. Natl. Acad. Sci. 94:4504. When a population of different DNA segments is used, both
recursive and non-recursive recombination methods (i.e., a single cycle of fragmentation,

VIII. Exemplary Applications

present disclosure. 

Exemplary Application 1: Changing Promoter Specificity

In one embodiment, the methods of the invention are used to evolve a 

characteristics of the regulatory sequence, such as inducibility, tissue specificity, or promoter
strength are changed. The use of the methods of the invention is particularly powerful for the

different combinations of modules (or differences in relative orientation) contributing to
regulatory activity/function in unpredictable ways.

Typically the mutation and screening of a promoter sequence is carried out using
a vector (e.g., an expression vector) in which the target promoter is operably linked to a reporter
gene (i.e., a gene encoding a gene product that can be conveniently assayed). Many suitable
reporter genes are well known in the art, including the green fluorescent protein (GFP),
luciferase, β-glucuronidase, β-galactosidase, and secreted alkaline phosphatase. An advantage
of using a promoter-reporter system is that a change in promoter function can be easily detected, 
facilitating a variety of simple screening methods. Once the promoter sequence is evolved by 
the present method to have the desired property or combination of properties, the promoter region 
can be cloned into a different vector (e.g., to drive transcription of a gene of interest other than 
the reporter gene). Alternatively, the reporter-gene sequence can be removed from the mutated 
vector and a different gene of interest inserted in its place. Methods for subcloning a promoter 
or coding sequence in a vector are well known to those of skill in the art (see, e.g., Ausubel et 
al., supra). For example, the mutated promoter can be amplified by the polymerase chain
reaction and the amplified sequence cloned into a region upstream of a selected coding sequence. 

Thus, in one exemplary embodiment of the invention, (1) the substrate population 
is a population of DNA segments having a particular promoter activity (e.g., the ability to direct
transcription of a reporter gene in a hepatocyte specific manner) and (2) the desired property is 
a different promoter activity (e.g., the ability to drive expression in T lymphocytes) or 
combination of activities (e.g., the ability to drive expression in both T lymphocytes and 
hepatocytes, but not pancreatic beta-cells). The generation of a lymphocyte-specific promoter, 
for example, may be carried out by mutating a substrate population comprising a hepatocyte 
promoter operably linked to a GFP reporter gene, and carrying out a suitable screen of the
resulting mutated population. 

The promoter sequences are mutated by random insertion and/or random deletion. 
As described supra, examples of suitable polynucleotides for insertion include random fragments 
from known promoters (e.g., a T-cell or hepatocyte specific promoter, the metallothionein 
promoter, the constitutive adenovirus major late promoter, the dexamethasone-inducible MMTV 
promoter, the SV40 promoter, the MRP polIII promoter, the constitutive MPSV promoter, the 
constitutive CMV promoter, and promoter-enhancer combinations known in the art), synthetic 
oligonucleotides constituting modules from known promoters, random sequence polynucleotides, 
and other sequences. In embodiments in which there is more than one round of mutation,
different polynucleotides may be inserted at different steps. For example, the substrate
[...] amplification of the promoter sequence and insertion of the product) as a cassette into a pristine
vector comprising a reporter gene. A variety of strategies will be apparent to one of skill
following the guidance of this disclosure. 

Exemplary Application 2: Changing an Enzymatic Activity

In some embodiments of the invention, the substrate population is a population
of DNA segments encoding a polypeptide with an enzymatic activity and the desired property
is a new enzymatic activity. In one embodiment, the substrate DNA segments encode a
polypeptide with β-galactosidase activity, and the different enzyme specificity desired is
fucosidase activity. Recursive rounds of mutation by alternative deletions (of 5-20 basepairs)
and insertions (from a library of random hexamers) can be combined with a screen as
described in Zhang et al., 1997, Proc. Natl. Acad. Sci. 94:4504. As noted supra, in cases in
which protein coding DNAs are mutated it will often be desirable to use mutation methods
that retain the existing reading frame (e.g., deletion and/or insertion of a multiple of 3
nucleotide bases), although, if desired, non-functional frame-shift mutants can be eliminated
during the screening step.
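
The reading-frame consideration above reduces to simple modular arithmetic, which may be sketched as follows. This Python fragment is illustrative only; the function name `retains_reading_frame` is hypothetical, and the check considers only the net change in length (it does not detect stop codons introduced by the mutation, nor the shifted stretch between a deletion and an insertion made at different sites).

```python
def retains_reading_frame(deleted_bp=0, inserted_bp=0):
    """True if the net change in length is a multiple of 3 nucleotides."""
    return (inserted_bp - deleted_bp) % 3 == 0


# Deletions in the 5-20 bp range that keep the downstream frame on their own.
frame_safe_deletions = [n for n in range(5, 21) if retains_reading_frame(deleted_bp=n)]
print(frame_safe_deletions)                       # [6, 9, 12, 15, 18]

# A random hexamer insertion adds exactly two codons.
print(retains_reading_frame(inserted_bp=6))       # True

# A 5 bp deletion shifts the frame unless compensated, e.g. by a 2 bp insertion.
print(retains_reading_frame(deleted_bp=5))                 # False
print(retains_reading_frame(deleted_bp=5, inserted_bp=2))  # True
```

Mutants failing this check correspond to the frame-shift mutants that, as noted, may alternatively be eliminated during the screening step.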

Exemplary Application 3: Changing a Property of an Encoded RNA

The methods of the invention may be used to evolve a regulatory element (or 
other region) of an RNA encoded by the DNA segment. For example, RNA stability 
elements are known which confer increased stability on mRNAs with which they are 
physically associated (e.g., encoded downstream of the protein coding sequence). Thus in
one embodiment of the invention, the substrate population is a population of DNA segments
that encode mRNA, and the desired property is increased mRNA stability. 

The evolution of a mRNA-encoding sequence to encode a more stable RNA is 
accomplished by randomly inserting DNA sequences into a substrate population encoding an 
mRNA, and screening or selecting for high levels of expression of the protein (because 
generally, expression of the protein product of the gene is proportional to the mRNA 
stability) or directly assaying the expression level of the mRNA. In one embodiment the
inserted sequences are fragments (e.g., defined or random fragments) of DNA sequences from
known stability elements (Chan et al., 1998, Proc. Natl. Acad. Sci. 95:643-6547; Russell et
al., 1998, Mol. Cell. Biol. 18:2173-2183).

In one embodiment, the increased gene expression in the mutated population is
detected and the resulting set of clones (or pools of 2-20 clones having the highest mRNA
stability), i.e., the selected population, is used in shuffling or, as a target population for
additional mutation. The additional mutation can include insertion of additional downstream
mRNA stability conferring fragments (the same as or different from those inserted in earlier
steps), deletion and screening for increased mRNA stability, or the insertion of different 
sequences (e.g., to confer a different selectable property on the RNA-encoding DNA 
segment). 

Exemplary Application 4: Addition of a Functional Domain to a Cloning or
Expression Vector

In this example, the DNA segments of the substrate population are cloning vectors 
which may be procaryotic, eukaryotic, or shuttle vectors, and which may be characterized vectors 
(e.g., pUC18) or uncharacterized vectors. Examples of vectors include artificial chromosomes,
plasmids, episomes, viruses, bacteriophages, and mobile elements (e.g., transposons, insertional
elements). It is often desirable to add a new functional domain or element to a vector by
inserting a cassette encoding a polypeptide (e.g., encoding a resistance marker or novel gene of
interest), regulatory element, combinations of genes and regulatory elements, or other functional
or structural elements. However, often the optimal location for insertion is not known. It is
especially difficult to design vectors with particular or optimal properties when the vectors are
complex (e.g., human papilloma virus and other eukaryotic viruses) or intended for use in
relatively uncharacterized species of fungi, plants, bacteria (e.g., Streptomycetes), etc. By
inserting the functional domain, or a fragment thereof, in a random manner, screening the resultant
mutant population and optimizing the desired property(s) by recursive insertion/deletion
mutation (and, optionally, shuffling), it is possible to efficiently generate vectors with novel and
optimized properties.

In one embodiment, an expression cassette (e.g., GFP under control of the E. coli
lac promoter) is inserted into random positions of the pool of a mixture of randomly linearized
vectors (e.g., a pool of pUC19, pETU, pBR322, and pBAD24). Following transformation into
host cells (e.g., E. coli) the expression of the protein is assayed (e.g., as assessed by its activity,
e.g., green fluorescence for GFP), and the clones expressing the highest levels of the reporter
gene when induced by IPTG or arabinose are identified and isolated (see, e.g., Crameri et al.,
1996, Nature Biotech. 14:315-319). DNA shuffling and further screening is carried out. The
resulting product is a vector comprising the GFP structural gene positioned in a particular vector
backbone at a position that provides the best expression properties of the protein.

Exemplary Application 5: Building an Operon [...]

In another example, the methods of the invention are used to generate a bacterial 
operon encoding several coding sequences (e.g., genes encoding proteins active in a particular 
metabolic pathway). Thus, in one embodiment, the coding sequences for each of the
polypeptides (e.g., enzymes) to be expressed are inserted in a stepwise fashion (e.g., as outlined
in Figure 1) into a vector comprising one or more promoters able to drive transcription of the
polypeptide coding sequences. After each insertion step, a screen is carried out for cells
optimally expressing the phenotype conferred by the inserted polypeptide(s). The resulting
multigenic operon comprises each of the polypeptide sequences positioned relative to each other,
regulatory elements, and other vector elements in positions that result in optimal expression (or
other selected-for properties). 

Exemplary Application 6: Insertion of an Affinity Selectable Tag into a Polypeptide

In another example, a cassette encoding an affinity selectable tag is randomly 
inserted into a substrate population of DNA segments that comprise a polypeptide coding 
sequence, resulting in mutant polypeptides that retain biological activity and have acquired the 
ability to be affinity selected. The addition of an affinity selectable tag to a biologically active 
protein is useful for, e.g., protein purification. 

Examples of sequences that can be randomly inserted into the polypeptide coding 
sequence of the substrate population include polynucleotides encoding affinity selectable oligo- 
or polypeptide sequences (e.g., peptide epitopes recognized by an immunoglobulin), anti- 
antibody fragments (e.g., Vaughan et al., 1996, Nat. Biotech. 14:309-314) and others well known 
in the art. Following insertion, the mutated population is screened and/or selected by a
combination of assays: typically one assay identifies mutant polypeptides that include the affinity
selectable sequence and a second assay identifies polypeptides that have a second biological 
property (such as the ability to encode a catalytically active enzyme). Screening for affinity 
(affinity selection) may be carried out by any suitable method, such as affinity chromatography,
immunoprecipitation, etc. In some embodiments, a phage display system is used for affinity
enrichment. In such systems, the encoded oligo- or polypeptide is presented on the surface of
a cell, virus or bacteriophage where it is susceptible to binding by the affinity partner (see, e.g.,
Ernst et al., 1998, Nucleic Acids Res. 26:1718-1723; and U.S. Patent Nos. 5,223,409 and
5,403,484).

Exemplary Application 7: Production of Protein Vaccines 

The production of protein vaccines is very often limited by the inefficient 
expression of the antigenic protein or inefficient processing of the antigen for presentation on 
MHC complexes. This can be overcome by insertion of one or several epitope sequences from
the antigen into a well expressed or efficiently processed protein. Thus, in one approach, 
multiple T-cell and/or B-cell epitopes are inserted into a known protein "scaffold." In one 
embodiment, the present invention is used to produce effective vaccines by the insertion of 
immunodominant T-cell and B-cell epitopes of an immunogenic protein in the scaffold of a
highly expressible protein.

In an exemplary embodiment, a known B-cell epitope from HIV gp120 is inserted
into a human scFv protein (Vaughan et al., 1996, Nature Biotechnology 14:309-314) and
expressed in E. coli. The presence of the B-cell epitope in the chimeric protein is screened for
as described in copending USSN 09/021769 and 60/074,294. Positive clones (i.e., from the
selected population) are pooled and all positive clones are used for the next round of insertion
of additional B-cell epitopes and/or T-cell epitopes. DNA shuffling is carried out using DNA
from individual clones. The resulting polypeptide comprises multiple well-expressed and well- 
processed immunogenic peptides and is useful as a vaccine. 

IX. EXAMPLES

The following examples are provided to illustrate the practice of the invention. 

EXAMPLE I

Synthesis of a Bacterial Vector Containing a New Regulatable Promoter

This example demonstrates the use of the invention to produce a vector with 
novel properties. Beginning with a known vector (pAK400-GFP) capable of expressing green 
fluorescent protein (GFP), a process including two cycles of random insertion/deletion
mutation and selection/screening is used to produce a panel of novel vectors. The new
vectors have new (compared to the parental vector) desired properties with respect to 
tetracycline resistance, inducibility, and GFP expression levels. 

A) Synthesis of Randomly Linearized pAK400-GFP

The parental vector pAK400-GFP is based on the pAK400 vector (Krebber et
al., 1997, J. Immunol. Meth. 201:35-55), but is modified by replacement of sequences encoding
the tetR (tetracycline resistance) gene with the coding sequence for green fluorescent protein
(GFP). To construct pAK400-GFP, GFP is PCR amplified by primers "GFP.For" and
"GFP.Rev" from pBADGFP cycle 3 (Crameri et al., 1996, Nature Biotech. 14:315-319) and
cloned by NdeI and HindIII in a three fragment ligation into a NdeI and HindIII vector
fragment of pAK400, resulting in "pAK400-GFP." In pAK400-GFP, expression of GFP is
under the control of the lac promoter and is inducible by isopropylthiogalactoside (IPTG). The
vector also contains an E. coli pUC derived ColE1 origin of replication, a lacI gene for the
expression of the lac repressor in order to repress the lac promoter efficiently, an f1 origin for
packaging of single stranded DNA in phagemids, and the gene for chloramphenicol acetyl
transferase which confers resistance to chloramphenicol (CamR).

Supercoiled pAK400-GFP is prepared in E. coli by CsCl/ethidium bromide 
equilibrium centrifugation according to standard procedures (e.g., Sambrook et al., supra). The 
vector is linearized by random cleavage by treatment with DNAse I in the presence of ethidium 
bromide, as described in Chaudry et al., 1995, Nucleic Acids. Res. 23:3805-3809. Following 
phenol/chloroform extraction, the once randomly nicked vector is treated with S1 nuclease at
low pH to cleave opposite the single stranded nick (Chaudry et al., supra). The randomly 
linearized vector is extracted using phenol/chloroform, precipitated and treated with a 
polymerase (to ensure the DNA is blunt ended) and with alkaline phosphatase (to 
dephosphorylate the linearized molecules to prevent self-ligation). Finally the linearized (i.e., 
once cleaved) molecule is purified on a 5% polyacrylamide gel or by CsCl/ethidium bromide
equilibrium centrifugation (Sambrook et al., supra). 

B) Synthesis of tetR polynucleotides for random insertion

The tetRA operon containing the tetR (tetracycline resistance) gene of Tn10
(Schollmeier et al., 1984, J. Bacteriol. 160:499-503) is PCR amplified from pAK400 (Krebber
et al., 1997, J. Immunol. Meth. 201:35-55) using the phosphorylated primers Tet.For and
Tet.Rev and a proof-reading polymerase (Pfu; Stratagene).

C) Inserting randomly the tet operon into pAK400-GFP

The blunt ended products of (A) and (B), supra, are ligated to each other
according to standard procedures (Sambrook et al., supra). 

D) Selecting for tetracycline and chloramphenicol resistance and screening for inducibility
of GFP by IPTG

The ligation reaction of step (C) is transformed into an E. coli K12 strain. The 
transformed cells are plated and selected on LB agar containing chloramphenicol, tetracycline 
and IPTG ("IPTG plates"). After growth overnight at 37°C, colonies are selected on the basis
of green fluorescence upon exposure to UV light (Crameri et al., 1996, Nature Biotech. 14:315- 
319), indicating expression of GFP. The GFP-expressing colonies are replica plated onto agar 
plates containing chloramphenicol, tetracycline, and 2% glucose ("glucose plates") and assayed 
for GFP expression (by inspection under UV irradiation). DNA is prepared from 100 colonies 
that express GFP on IPTG plates (initial plating) but not on glucose plates (replica plating). 
These DNA segments comprise a population of different (in respect to the position of the
tetRA operon) vectors with the phenotype: CamR, TetR, IPTG-inducible expression of GFP
(i.e., IPTG inducible promoter). The vectors in this population may be referred to as
pAK400-GFP-Tet. As noted supra, the tetR gene is inserted in different positions in different species
in the population. 

E) Synthesis of double-stranded oligonucleotides from the tet regulatory unit of Tn10

Non-phosphorylated double-stranded oligonucleotides (the pairs of 
Op1.For/Op1.Rev and Op2.For/Op2.Rev) which encode the two operators of the Tn10 promoter
(Bertrand et al., 1983, Gene 23:149-156) are synthesized chemically. Together the two
oligonucleotides are referred to as the "tet oligonucleotides." 

F) Ligation of the oligonucleotides into the linearized vector pAK400-GFP and
swapping of the promoter region into pAK400-GFP-Tet

In this and the following steps, the tet oligonucleotides are randomly inserted
into linearized pAK400 vector (linearized as described for the pAK400-GFP vector in step A,
supra, but not dephosphorylated) to produce a population of pAK400 vectors containing
random insertions of the oligonucleotides. Subsequently the (mutated) lac promoter regions
from the population (containing insertions) are transferred to the population of
pAK400-GFP-Tet vectors made in step D, supra.

(An alternative strategy would be to randomly insert into the pAK400-GFP-Tet 
vector population. The strategy used is preferred because it requires screening fewer clones, 
i.e., only clones in which the tet oligonucleotides have inserted at random sites within the lac 
promoter region rather than in other sites in the vector.) 

As a first step, the concentration of double stranded tet oligonucleotides is
optimized by ligating different amounts of oligonucleotide into the randomly linearized vector,
followed by transformation into an appropriate E. coli K12 strain. After growth overnight at
37°C, the colonies are counted. The optimal concentration of oligonucleotide is that
concentration that just decreases the number of colonies. Although optimizing the
oligonucleotide concentration will increase efficiency, this step is not critical.

Having determined the optimal oligonucleotide concentrations for insertion into 
the randomly linearized pAK400 (from above), the double-stranded tet oligonucleotides 
encoding parts of the tet promoter region are inserted into the randomly linearized pAK400 
vector by blunt end ligation. After phenol/chloroform extraction, the resulting ligation is cut 
with KpnI and NdeI at unique sites flanking the lac promoter of pAK400. The resulting
fragments containing the lac promoter and a tet promoter oligonucleotide are isolated using
electrophoresis in a non-denaturing 8% polyacrylamide gel (Sambrook et al., supra). The
KpnI-NdeI fragment from pAK400 is 209 bp. When a 20 basepair oligonucleotide is inserted,
the lac promoter fragment will increase in size to 229 bp. Accordingly, a 229 bp band is
isolated from the non-denaturing gel. The isolated fragment is cloned (ligated) into the
pAK400-GFP-Tet vector pool, which has been KpnI and NdeI digested. The result is that
some (though usually not all) of the resulting ligation products will comprise a randomly
mutated lac promoter (i.e., containing random insertions of the tet promoter oligonucleotide)
in a pAK400-GFP vector that is also randomly mutated (i.e., by random insertion of tetRA
operon).
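
The size-based isolation described in this step rests on simple arithmetic, which may be sketched as follows. The Python fragment is illustrative only; the function names `expected_fragment_size` and `insert_copies` are hypothetical, and band sizes are treated as exact integers, whereas sizes read from a gel are approximate.

```python
def expected_fragment_size(parent_bp, insert_bp, copies=1):
    """Size in bp of a restriction fragment carrying `copies` insertions."""
    return parent_bp + copies * insert_bp


def insert_copies(observed_bp, parent_bp, insert_bp):
    """Number of whole insertions implied by a band size, or None if not a whole number."""
    extra = observed_bp - parent_bp
    if extra < 0 or extra % insert_bp != 0:
        return None
    return extra // insert_bp


# 209 bp KpnI-NdeI fragment plus one 20 bp tet oligonucleotide.
print(expected_fragment_size(209, 20))   # 229
print(insert_copies(229, 209, 20))       # 1
print(insert_copies(249, 209, 20))       # 2 (a fragment carrying two insertions)
```

The same arithmetic accounts for the approximately 896 bp fragment of Example II (approximately 806 bp plus a 90 bp insertion).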

G) Selecting for [...] and screening for inducibility of GFP by IPTG
and/or tetracycline

The ligation of step (F) is transformed into an appropriate E. coli K12 strain. 
The transformation is plated and selected on agar plates containing 30 µg/ml chloramphenicol,
5 µg/ml tetracycline, and 2% glucose. The colonies are grown overnight at 37°C.

The recombinants are screened to identify vectors which have different 
promoters. The expression of GFP in the presence and absence of IPTG and/or tetracycline is 
determined as described infra. Tetracycline and chloramphenicol resistant colonies are 
selected by growth in the presence of these two antibiotics. The resistant colonies are replica 
plated on to four different plates. All plates contain chloramphenicol (to select for the CamR
of the pAK400 vector backbone). Plate 2 additionally contains IPTG, Plate 3 additionally
contains tetracycline, and Plate 4 additionally contains tetracycline and IPTG. 

Expression of the GFP reporter gene by colonies is detected by visual or 
electronic observation of green fluorescence of colonies exposed to UV light (Crameri et al.,
1996, Nature Biotech. 14:315-319). Colonies that express GFP on one plate and not on one
of the others are regulated by IPTG and/or tetracycline. Compared to the parental vector
(which is exclusively regulated by the presence or absence of IPTG) colonies in which GFP 
expression is either increased or decreased by the presence or absence of tetracycline have a 
regulatory function not present in the parent. This screen is able to identify populations of 
vectors with new phenotypes, i.e., CamR, TetR, and GFP expression when different
combinations of tetracycline and IPTG are used. 
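
The logic of the four-plate replica screen can be summarized in a short sketch. The following Python fragment is illustrative only and rests on assumptions not stated in the text: fluorescence is scored as a simple yes/no on each plate (whereas the text also contemplates increased or decreased expression), Plate 1 is taken to contain chloramphenicol alone, and the names `PlateReadout` and `classify` are hypothetical.

```python
from typing import NamedTuple


class PlateReadout(NamedTuple):
    """GFP fluorescence (True/False) of one colony on each replica plate."""
    cam_only: bool   # Plate 1: chloramphenicol only (assumed)
    iptg: bool       # Plate 2: plus IPTG
    tet: bool        # Plate 3: plus tetracycline
    tet_iptg: bool   # Plate 4: plus tetracycline and IPTG


def classify(r):
    """Report which inducer(s) change GFP expression for this colony."""
    responds_to_iptg = (r.cam_only != r.iptg) or (r.tet != r.tet_iptg)
    responds_to_tet = (r.cam_only != r.tet) or (r.iptg != r.tet_iptg)
    if responds_to_iptg and responds_to_tet:
        return "IPTG- and tetracycline-regulated"
    if responds_to_tet:
        return "tetracycline-regulated"
    if responds_to_iptg:
        return "IPTG-regulated (parental-like)"
    return "constitutive" if r.cam_only else "no expression"


print(classify(PlateReadout(False, True, False, True)))   # parental pattern
print(classify(PlateReadout(False, False, True, True)))   # new regulatory function
print(classify(PlateReadout(False, False, False, True)))  # requires both inducers
```

Colonies placed in either tetracycline-responsive class correspond to vectors having a regulatory function not present in the parent.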

The described properties of these vectors may be enhanced further by additional 
rounds of insertion, rounds of deletion, or by shuffling, using the same screen described supra 
(and, e.g., assaying for increased levels of GFP expression) or other screens. 

EXAMPLE II

Production of a β-Lactamase Containing an In Vivo Biotinylation Peptide

This example demonstrates the generation of a high-activity beta-lactamase 
polypeptide that contains an in vivo biotinylation sequence. The beta-lactamase gene is capable 
of conferring ampicillin resistance when expressed in a bacterium; the biotinylation sequence 
may be used to detect or purify a polynucleotide comprising the high-activity beta-lactamase 
polypeptide. This example is illustrative of the creation of a novel multifunctional polypeptide
using the techniques of the invention.

A) The bla gene (encoding beta-lactamase) is PCR amplified from pUC19 using
the primers Bla.For and Bla.Rev and subsequently cloned into the SfiI restriction site of
pAK200 (Krebber et al., 1997, J. Immunol. Meth. 201:35-55). The resulting vector,
pAK200SAMP, is randomly linearized (but not phosphorylated) as described in Example I,
supra. 

A double-stranded 90-bp polydeoxyribonucleotide is generated by annealing
of 90-mers Bio.Rev and Bio.For (encoding a polypeptide having an in vivo biotinylation site
sequence (Schatz, 1993, Bio/Technology 11:1138-1143)), added in excess, and ligated to the
randomly linearized pAK200SAMP vector at random positions. The in vivo biotinylation site
becomes biotinylated when the protein is expressed in E. coli strains which express the
endogenous biotin holoenzyme synthetase encoded by birA (Barker et al., 1981, J. Mol. Biol.
146:451-467).

The pAK200SAMP vector is cleaved with SfiI. The fragment containing the
bla gene and a 90 bp insertion is identified by size and gel purified by standard methods. The
fragment including the biotinylation sequence is approximately 896 bp (compared to
approximately 806 bp without the insert). The purified fragments are cloned into the SfiI site
of phage display vector pAK200 (Krebber et al., 1997, supra). After transformation of the
phagemid library, the bacteria are spread on 2YT-agar plates containing 30 µg/ml
chloramphenicol and a concentration of ampicillin that reduces the recovery from the
transformation to 50% of the measured complexity (measured complexity is assessed by
plating on 2YT-agar containing 30 µg/ml chloramphenicol; hereinafter "2YT-Cam30" plates).
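
One way in which the limiting ampicillin concentration might be estimated from a pilot titration is sketched below. The sketch is offered under loudly stated assumptions and is not taken from the present disclosure: the colony counts and ampicillin concentrations are invented for illustration, the function name `limiting_concentration` is hypothetical, and linear interpolation between neighboring titration points is merely one convenient approximation.

```python
def limiting_concentration(titration, complexity, target_fraction=0.5):
    """Interpolate the ampicillin concentration giving the target recovery.

    titration: list of (ampicillin concentration, colony count) pairs.
    complexity: colony count on plates lacking ampicillin (the measured complexity).
    Returns None if the target recovery is not bracketed by the titration.
    """
    target = target_fraction * complexity
    points = sorted(titration)
    for (c_lo, n_lo), (c_hi, n_hi) in zip(points, points[1:]):
        if n_lo >= target >= n_hi and n_lo != n_hi:
            # Linear interpolation between the two bracketing points.
            return c_lo + (n_lo - target) * (c_hi - c_lo) / (n_lo - n_hi)
    return None


# Hypothetical pilot titration: (ampicillin in arbitrary units, colonies recovered).
pilot = [(0, 10000), (50, 8000), (100, 6000), (200, 2000), (400, 500)]
print(limiting_concentration(pilot, complexity=10000, target_fraction=0.5))   # 125.0
print(limiting_concentration(pilot, complexity=10000, target_fraction=0.25))  # 187.5
```

Passing a smaller `target_fraction` models a more stringent selection.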

After growth overnight at 30°C, the plates are scraped and resuspended in 2YT. 
An aliquot is added to 100 ml 2YT-Cam30 containing the above calculated concentration of 
ampicillin. After coinfection with VCSM13 (Stratagene) according to Krebber et al., 1997, 
supra, and growth, the phages are precipitated and panned in PBS/dialyzed 2% skim milk for 
two to four rounds against streptavidin (Hawkins et al., 1992, J. Mol. Biol. 226:889-896)
immobilized on magnetic beads (Dynal). The binding of single clones to streptavidin is
verified by phage ELISA (Lindner et al., 1997, Biotechniques 22:140-49). These clones (which
are heterogeneous) are referred to as "pAK200-bla-bio." The combination of the selection on
ampicillin plates and the panning procedure identifies polynucleotides encoding an active
beta-lactamase gene containing a biotinylation sequence.

B) The expression and beta-lactamase activity of the pAK200-bla-bio produced in 
Section A, supra, are optimized by PCR shuffling (Stemmer, 1994, Nature 370:389-391). To 
do this, five to ten pAK200-bla-bio species (clones) are selected based on comparatively high 
beta-lactamase activity (as assessed by conferring on host bacteria resistance to high ampicillin 
concentrations). The bla-bio insertion is amplified by PCR using Bla.For and Bla.Rev primers. 
According to a standard PCR shuffling protocol (Stemmer, 1994, Nature, supra), the PCR 
products are fragmented randomly by DNase I, reassembled and cloned into the SfiI sites of 
pAK200SAMP. The library is grown overnight at 30°C on 2YT-Agar containing 30 µg/ml 
chloramphenicol and a concentration of ampicillin (the "limiting" concentration) which reduces 
the recovery from the transformation to 25% of the measured complexity when grown on plates 
lacking ampicillin. As described supra, the library is scraped from the plates, grown in the 
presence of the limiting concentration of ampicillin, and coinfected with helper phage (supra) 
to produce phage particles presenting bla-bio fusion insertions. Those phage particles are again 
panned against streptavidin beads (supra). Additional shuffling rounds are carried out using 
selection conditions in which the ampicillin concentration is increased, and temperatures for 
growth, selection and panning are increased to 37°C. This allows the further optimization of 
the bla-bio insertion fusions with respect to activity, biotinylation level, folding and stability. 
The fusion(s) with optimal activity can be used for quantitation of streptavidin, e.g., by 
measuring beta-lactamase activity in a sandwich ELISA. 
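
The "limiting" ampicillin concentration used in Section B (the concentration at which recovery falls to 25% of the complexity measured without ampicillin) could be estimated from a plating series. The following Python sketch is an illustration only and is not part of the described protocol: the function name, the use of linear interpolation between the two bracketing plates, and the colony counts are all assumptions made for the example.

```python
def limiting_concentration(counts_by_conc, target=0.25):
    """Estimate the ampicillin concentration giving `target` relative recovery.

    counts_by_conc maps concentration (ug/ml) -> colony count. The entry at
    concentration 0 is taken as the measured complexity without ampicillin.
    Linear interpolation is an assumption of this sketch.
    """
    baseline = counts_by_conc[0]
    # Sort by concentration and convert counts to fractions of the baseline.
    points = sorted((conc, n / baseline) for conc, n in counts_by_conc.items())
    for (c_lo, f_lo), (c_hi, f_hi) in zip(points, points[1:]):
        if f_lo >= target >= f_hi and f_lo != f_hi:
            # Interpolate between the two plates that bracket the target.
            return c_lo + (f_lo - target) * (c_hi - c_lo) / (f_lo - f_hi)
    raise ValueError("target recovery is not bracketed by the measurements")


# Hypothetical colony counts, for illustration only (not data from this document).
example_counts = {0: 10000, 50: 8000, 100: 4000, 200: 1000, 400: 100}
print(limiting_concentration(example_counts))
```

In practice a survival curve is rarely linear, so a finer dilution series around the bracketing plates would give a better estimate than interpolation alone.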



Table I 

Primers, Oligonucleotides, Polynucleotides 

GFP.For    AAGGAGATATACATATGGCTAGCAAAGGAGAAG 
GFP.Rev    TTCACAGGTCAAGCTTCATTATTTGTAGAGCTCATC 
Tet.For    TTAAGACCCACTTTCACATTTAAG 
Tet.Rev    CTAAGCACTTGTCTCCTGTTTAC 
Op1.For    CACTCTATCATTGATAGAGT 
Op1.Rev    ACTCTATCAATGATAGAGTG 
Op2.For    TCCCTATCAGTGATAGAGAA 
Op2.Rev    TTCTCTATCACTGATAGGGA 
Bla.For    TATTACTCGCGGCCCAGCCGGCCTTTGCTCACCCAGAAAC 
Bla.Rev    TAGAATTCGGCCCCCGAGGCCAATGCTTAATCAGTGA 
Bio.For    GGTTCTGAAGGTGGTGGTTCTGCTCAGCGTCTGTTCCACATCCTGG 
           ACGCTCAGAAAATCGAATGGCACGGTCCGAAAGGTGGTTCTGGT 
Bio.Rev    ACCAGAACCACCTTTCGGACCGTGCCATTCGATTTTCTGAGCGTCC 
           AGGATGTGGAACAGACGCTGAGCAGAACCACCACCTTCAGAACC 
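
The operator (Op1, Op2) and biotinylation (Bio) oligonucleotides in Table I appear to be complementary For/Rev pairs. A short Python sketch for checking this is given below. It is an illustration, not part of the original disclosure: the helper names are hypothetical, and the Rev sequence paired with Op2.For is read here as TTCTCTATCACTGATAGGGA, which is an assumption about that table entry.

```python
# Map each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# (forward, reverse) oligonucleotide pairs transcribed from Table I.
PAIRS = {
    "Op1": ("CACTCTATCATTGATAGAGT", "ACTCTATCAATGATAGAGTG"),
    # Assumed reading of the entry paired with Op2.For.
    "Op2": ("TCCCTATCAGTGATAGAGAA", "TTCTCTATCACTGATAGGGA"),
    "Bio": (
        "GGTTCTGAAGGTGGTGGTTCTGCTCAGCGTCTGTTCCACATCCTGG"
        "ACGCTCAGAAAATCGAATGGCACGGTCCGAAAGGTGGTTCTGGT",
        "ACCAGAACCACCTTTCGGACCGTGCCATTCGATTTTCTGAGCGTCC"
        "AGGATGTGGAACAGACGCTGAGCAGAACCACCACCTTCAGAACC",
    ),
}


def check_pairs(pairs):
    """Report, per pair, whether Rev is the exact reverse complement of For."""
    return {
        name: reverse_complement(fwd) == rev
        for name, (fwd, rev) in pairs.items()
    }


for name, ok in check_pairs(PAIRS).items():
    print(name, "complementary" if ok else "MISMATCH")
```

The same check is a convenient way to catch transcription errors when long oligonucleotides are copied between documents.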



*** 

Many modifications and variations of this invention can be made without 
departing from its spirit and scope, as will be apparent to those skilled in the art. The 
specific embodiments described herein are offered by way of example only, and the 
invention is to be limited only by the terms of the appended claims, along with the full 
scope of equivalents to which such claims are entitled. 

All references cited herein are incorporated herein by reference in their 
entirety and for all purposes to the same extent as if each individual publication or patent 
application was specifically and individually indicated to be incorporated by reference in its 
entirety for all purposes. 
14. A method of producing a DNA segment having a desired property, said 
method comprising: 

a) mutating a first substrate population, said substrate population comprising a 
plurality of DNA segments, wherein said mutating comprises 

i) making insertions at random sites in said segments, or 

ii) making deletions at random sites in said segments; 
whereby a first mutated population of mutated DNA segments is produced; 

b) mutating a second substrate population, said substrate population comprising 
a plurality of DNA segments, wherein said mutating comprises 

i) making insertions at random sites in said segments, or 

ii) making deletions at random sites in said segments; 
whereby a second mutated population of mutated DNA segments is produced; 

c) recombining the first substrate population and the second substrate 
population, whereby a recombined population is produced; and, 

d) screening the recombined population to identify at least one DNA segment 
with the desired property. 

15. The method of claim 14 wherein the first and second mutated populations 
are screened to produce a first and second selected population, each having a desired 
property, and the selected populations are recombined. 

16. The method of claim 14, wherein the recombination is carried out by 
shuffling. 

17. The method of claim 14, wherein the recombination is directed. 

18. The method of claim 14, wherein the first desired property and the second 
desired property are the same. 
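
The sequence of steps a) to d) of claim 14 can be pictured with a toy in-silico simulation. The Python sketch below is offered only as an illustration under stated assumptions and is not the claimed method itself: the function names are hypothetical, recombination is modeled as a single crossover rather than actual shuffling, the mutated populations are the ones recombined (as drawn in Figure 2), and the "desired property" is an arbitrary stand-in score (count of a GATC motif).

```python
import random

BASES = "ACGT"


def mutate_indel(seq, rng, max_len=3):
    """Apply one random insertion or one random deletion of 1..max_len bases."""
    size = rng.randint(1, max_len)
    if rng.random() < 0.5:
        # Insertion at a random site (including either end).
        pos = rng.randrange(len(seq) + 1)
        insert = "".join(rng.choice(BASES) for _ in range(size))
        return seq[:pos] + insert + seq[pos:]
    # Deletion starting at a random site inside the segment.
    pos = rng.randrange(len(seq))
    return seq[:pos] + seq[pos + size:]


def recombine(a, b, rng):
    """Toy single crossover: prefix of `a` joined to suffix of `b`."""
    frac = rng.random()
    return a[: int(frac * len(a))] + b[int(frac * len(b)):]


def evolve(pop1, pop2, score, rng, n_offspring=200):
    """Mutate both populations, recombine them, and screen for the best score."""
    mutated1 = [mutate_indel(s, rng) for s in pop1]   # step a)
    mutated2 = [mutate_indel(s, rng) for s in pop2]   # step b)
    recombined = [                                    # step c)
        recombine(rng.choice(mutated1), rng.choice(mutated2), rng)
        for _ in range(n_offspring)
    ]
    return max(recombined, key=score)                 # step d)


def score(seq):
    """Hypothetical stand-in for a screen: number of GATC motifs."""
    return seq.count("GATC")


rng = random.Random(0)
# Random starting segments, for illustration only.
pop1 = ["".join(rng.choice(BASES) for _ in range(60)) for _ in range(20)]
pop2 = ["".join(rng.choice(BASES) for _ in range(60)) for _ in range(20)]
best = evolve(pop1, pop2, score, rng)
print(best, score(best))
```

Calling `evolve` repeatedly on the best segments of the previous round would correspond to carrying the procedure out recursively.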
FIGURE 1 

FIGURE 2 

[Flow chart: First Substrate Population and Second Substrate Population -> MUTATE BY 
RANDOM INSERTION/DELETION (each population) -> First Mutated Population and Second 
Mutated Population -> RECOMBINE BY RESTRICTION/LIGATION OR BY SHUFFLING -> SCREEN AND 
SEPARATE DNA SEGMENT OF INTEREST -> DNA Segment with Desired Property. Side box: 
Optionally Carry Out Additional Mutation/Recombination and Screening/Selection Steps.] 



WO 99/65927    3/3    PCT/US99/13479 



FIGURE 3 